3-D Face Point Trajectory Synthesis Using An Automatically Derived Visual Phoneme Similarity Matrix

Authors

Levent M. Arslan

David Talkin

Abstract

This paper presents a novel algorithm which generates three-dimensional face point trajectories for a given speech file, with or without its text. The proposed algorithm first employs an off-line training phase. In this phase, recorded face point trajectories, along with their speech data and phonetic labels, are used to generate phonetic codebooks. These codebooks consist of both acoustic and visual features. Acoustics are represented by line spectral frequencies (LSF), and face points are represented by their principal components (PC). During the synthesis stage, the speech input is rated in terms of its similarity to the codebook entries. Based on this similarity, each codebook entry is assigned a weighting coefficient. If phonetic information about the test speech is available, it is used to restrict the codebook search to the few codebook entries which are visually closest to the current phoneme (a visual phoneme similarity matrix is generated for this purpose). These weights are then used to synthesize the principal components of the face point trajectory. The performance of the algorithm was tested on held-out data, and the synthesized face point trajectories showed a correlation of 0.73 with the true face point trajectories.
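The synthesis stage described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the distance measure or the weighting function, so the Euclidean LSF distance, the exponential weighting, and the names `synthesize_frame`, `visual_sim`, `top_k` and `gamma` are all assumptions made for illustration.

```python
import math

def synthesize_frame(lsf, codebook, phoneme=None, visual_sim=None,
                     top_k=2, gamma=1.0):
    """Estimate the visual principal components for one speech frame.

    codebook:   list of (phoneme_label, lsf_vector, pc_vector) entries.
    visual_sim: hypothetical format, dict phoneme -> {phoneme: similarity}.
    """
    entries = codebook
    if phoneme is not None and visual_sim is not None:
        # Restrict the search to the top_k visually closest phonemes.
        row = visual_sim[phoneme]
        allowed = set(sorted(row, key=row.get, reverse=True)[:top_k])
        restricted = [e for e in codebook if e[0] in allowed]
        if restricted:  # fall back to the full codebook if nothing matches
            entries = restricted

    # Assumed similarity measure: Euclidean distance between LSF vectors.
    dists = [math.dist(lsf, e[1]) for e in entries]

    # Assumed weighting: exponential in distance, normalized to sum to 1.
    raw = [math.exp(-gamma * d) for d in dists]
    total = sum(raw)
    weights = [r / total for r in raw]

    # Weighted sum of the visual principal components of the entries.
    n_pc = len(entries[0][2])
    return [sum(w * e[2][i] for w, e in zip(weights, entries))
            for i in range(n_pc)]

# Toy codebook: two bilabials that look alike and one open vowel.
codebook = [
    ("p", [0.1, 0.2], [1.0, 0.0]),
    ("b", [0.1, 0.2], [1.0, 0.0]),
    ("a", [0.9, 1.0], [0.0, 1.0]),
]
visual_sim = {"p": {"p": 1.0, "b": 0.9, "a": 0.1}}

# With a phoneme label, only the "p" and "b" entries are searched.
with_text = synthesize_frame([0.9, 1.0], codebook, "p", visual_sim)
# Without a label, the acoustically closest entry ("a") dominates.
without_text = synthesize_frame([0.9, 1.0], codebook, gamma=50.0)
```

In this toy case the phoneme restriction overrides the acoustic match, which shows the role the visual phoneme similarity matrix plays when text is available.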


Similar resources

Speech driven 3-d face point trajectory synthesis algorithm

Codebook Based Face Point Trajectory Synthesis Algorithm Using Speech

A New Vision-Based and GPS-Signal-Independent Approach in Jamming Detection and UAV Absolute Positioning Assessment

Unmanned Aerial Vehicle (UAV) positioning in outdoor environments is usually done with the Global Positioning System (GPS). Because the GPS signal has low power at the earth's surface, its performance is disrupted in environments contaminated by jamming attacks. UAV positioning with GPS, and its accuracy, are degraded under jamming attacks. A positioning error about tens of...

Phoneme Similarity Matrices to Improve Long Audio Alignment for Automatic Subtitling

Long audio alignment systems for Spanish and English are presented, within an automatic subtitling application. Language-specific phone decoders automatically recognize audio contents at phoneme level. At the same time, language-dependent grapheme-to-phoneme modules perform a transcription of the script for the audio. A dynamic programming algorithm (Hirschberg's algorithm) finds matches betwee...

Additional use of phoneme duration hypotheses in automatic speech segmentation

In this paper, we describe a new approach for speaker independent automatic phoneme alignment. Typical algorithms for this task use only phoneme-to-frame similarity measures which are somehow maximised or minimised. In addition to such similarity measures, we use phoneme duration hypotheses generated by the speech synthesis system HADIFIX [1]. For algorithms based on dynamic programming, it is ...


Publication date: 1998
